Efficient client-server based implementations of mobile speech recognition services
Abstract
The purpose of this paper is to demonstrate the efficiencies that can be achieved when automatic speech recognition (ASR) applications are provided to large user populations using client-server implementations of interactive voice services. It is shown that, through proper design of a client-server framework, excellent overall system performance can be obtained with minimal demands on the computing resources that are allocated to ASR. System performance is considered in the paper in terms of both ASR speed and accuracy in multi-user scenarios. An ASR resource allocation strategy is presented that maintains sub-second average speech recognition response latencies observed by users even as the number of concurrent users exceeds the available number of ASR servers by more than an order of magnitude. An architecture for unsupervised estimation of user-specific feature space adaptation and normalization algorithms is also described and evaluated. Significant reductions in ASR word error rate were obtained by applying these techniques to utterances collected from users of hand-held mobile devices. These results are important because, while there is a large body of work addressing the speed and accuracy of individual ASR decoders, there has been very little effort applied to dealing with the same issues when a large number of ASR decoders are used in multi-user scenarios.
Preprint submitted to Elsevier Science, 5 May 2006
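The abstract does not spell out the resource allocation strategy, so the following is only a toy illustration of why a small pool of ASR servers can plausibly serve many more concurrent users with sub-second latency: users speak only intermittently, so decoders can be shared per utterance. Everything here is an assumption made for illustration and is not taken from the paper: the function name `simulate`, the first-come-first-served pool, the exponential think and decode times, and all numeric defaults.

```python
import heapq
import random


def simulate(num_users, num_servers, utterances_per_user=20,
             mean_think_s=10.0, mean_decode_s=0.3, seed=0):
    """Toy model: average response latency (queueing wait + decode time), in seconds.

    Hypothetical assumptions: each user issues utterances separated by
    exponentially distributed pauses, each utterance needs an exponentially
    distributed decode time, and free decoders are handed out first come,
    first served. A user's next utterance does not wait for the previous
    result, which keeps the model simple.
    """
    rng = random.Random(seed)

    # Build the list of (arrival_time, decode_time) requests for all users.
    requests = []
    for _ in range(num_users):
        t = 0.0
        for _ in range(utterances_per_user):
            t += rng.expovariate(1.0 / mean_think_s)
            requests.append((t, rng.expovariate(1.0 / mean_decode_s)))
    requests.sort()

    # Min-heap holding the time at which each decoder next becomes free.
    free_at = [0.0] * num_servers
    heapq.heapify(free_at)

    total_latency = 0.0
    for arrival, decode in requests:
        earliest_free = heapq.heappop(free_at)
        start = max(arrival, earliest_free)  # wait if every decoder is busy
        finish = start + decode
        heapq.heappush(free_at, finish)
        total_latency += finish - arrival
    return total_latency / len(requests)


if __name__ == "__main__":
    # 100 concurrent users sharing 5 decoders (a 20:1 ratio) in this toy setup.
    print(f"average latency: {simulate(100, 5):.3f} s")
```

Under these made-up parameters the pool is busy well below capacity, so most utterances find a free decoder immediately. Shrinking the pool or lengthening decode times in the sketch shows how quickly latency grows once the decoders saturate.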
Related resources
Automatic Speech Recognition on Mobile Communication Networks
As mobile devices become pervasive and small, the design of efficient user interfaces is rapidly developing into a major issue. The expectation for speech-centric interfaces has stimulated a great interest in deploying automatic speech recognition (ASR) on devices like mobile phones, PDAs and automobiles. Mobile devices are characterised as having limited computational power, memory size and ba...
Development of client-server speech translation system on a multi-lingual speech communication platform
This paper describes a client-server speech-to-speech translation system developed on a multi-lingual speech communication platform. This platform enables easy assembly of speech communication system from the corresponding software modules (e.g. speech recognition, spoken language machine-translation, speech synthesis). This client-server speech translation system is designed for use at mobile ...
Robust speech recognition in client-server scenarios
This paper addresses issues that are specific to the implementation of automatic speech recognition (ASR) applications and services in client-server scenarios. It is assumed in all of these scenarios that functionality in a human-machine dialog system is distributed between mobile client devices and network based multi-user media and application servers. It is argued that, while there has alrea...
Acoustic Model and Language Model Adaptation for a Mobile Dictation Service
Automatic speech recognition is the machine-based method of converting speech to text. MobiDic is a mobile dictation service which uses a server-side speech recognition system to convert speech recorded on a mobile phone to readable and editable text notes. In this work, performance of the TKK speech recognition system has been evaluated on law-related speech recorded on a mobile phone with the...
Internet Chinese information retrieval using unconstrained Mandarin speech queries based on a client-server architecture and a PAT-tree-based language model
In order to pursue high performance of Chinese information access on the Internet, this paper presents an attractive approach with a successful integration of efficient speech recognition and information retrieval techniques. A working system based on the proposed approach for speech retrieval of real-time Chinese netnews services has been implemented and tested. Very exciting performance has b...
Journal: Speech Communication
Volume: 48
Publication date: 2006